Overview

Dataset statistics

Number of variables: 33
Number of observations: 130064
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1575
Duplicate rows (%): 1.2%
Total size in memory: 32.7 MiB
Average record size in memory: 264.0 B

Variable types

Numeric: 12
Categorical: 21

Warnings

Dataset has 1575 (1.2%) duplicate rows [Duplicates]
lat has a high cardinality: 54505 distinct values [High cardinality]
long has a high cardinality: 55046 distinct values [High cardinality]
Num_Acc is highly correlated with id_vehicule [High correlation]
id_vehicule is highly correlated with Num_Acc [High correlation]
agg is highly correlated with catr and 1 other field [High correlation]
agg is highly correlated with vma and 2 other fields [High correlation]
catr is highly correlated with agg and 1 other field [High correlation]
catr is highly correlated with vma and 2 other fields [High correlation]
vma is highly correlated with agg and 1 other field [High correlation]
vma is highly correlated with catr and 3 other fields [High correlation]
atm is highly correlated with surf [High correlation]
surf is highly correlated with atm [High correlation]
obs is highly correlated with obsm [High correlation]
obsm is highly correlated with obs [High correlation]
obsm is highly correlated with obs and 2 other fields [High correlation]
num_veh is highly correlated with col [High correlation]
col is highly correlated with num_veh and 1 other field [High correlation]
circ is highly correlated with vma and 2 other fields [High correlation]
place is highly correlated with catu [High correlation]
place is highly correlated with obsm and 1 other field [High correlation]
catu is highly correlated with place [High correlation]
catu is highly correlated with place and 1 other field [High correlation]
catv is highly correlated with secu1 [High correlation]
secu1 is highly correlated with catv and 1 other field [High correlation]
lum is highly correlated with agg [High correlation]
nbv is highly correlated with vma and 1 other field [High correlation]
lat is uniformly distributed [Uniform]
long is uniformly distributed [Uniform]
nbv has 3427 (2.6%) zeros [Zeros]
infra has 108372 (83.3%) zeros [Zeros]
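
The percentages quoted in the duplicate-row and zero-count warnings can be checked against the number of observations (130064). The snippet below is a minimal sketch of that arithmetic; the helper name `share` is illustrative and is not part of pandas-profiling.

```python
# Counts copied from the warnings in this report.
N_OBSERVATIONS = 130064


def share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


print("duplicate rows:", share(1575))
print("nbv zeros:", share(3427))
print("infra zeros:", share(108372))
```

Each printed value should agree with the percentage shown next to the corresponding count in the warning.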

Reproduction

Analysis started: 2021-07-24 07:52:23.142249
Analysis finished: 2021-07-24 07:54:30.961332
Duration: 2 minutes and 7.82 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
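
The reported duration follows from the two timestamps. As a small sketch using only the Python standard library (the variable names are illustrative), the elapsed time can be recomputed like this:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-07-24 07:52:23.142249", FMT)
finished = datetime.strptime("2021-07-24 07:54:30.961332", FMT)

# Split the elapsed seconds into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```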

Variables

Num_Acc
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 57586
Distinct (%): 44.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.019000295 × 10^11
Minimum: 2.019 × 10^11
Maximum: 2.019000588 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1016.2 KiB

Quantile statistics

Minimum: 2.019 × 10^11
5-th percentile: 2.01900003 × 10^11
Q1: 2.019000148 × 10^11
median: 2.019000294 × 10^11
Q3: 2.019000443 × 10^11
95-th percentile: 2.019000559 × 10^11
Maximum: 2.019000588 × 10^11
Range: 58839
Interquartile range (IQR): 29467.5

Descriptive statistics

Standard deviation: 16984.02483
Coefficient of variation (CV): 8.41209626 × 10^-8
Kurtosis: -1.20187369
Mean: 2.019000295 × 10^11
Median Absolute Deviation (MAD): 14740
Skewness: -0.0006488678133
Sum: 2.625992543 × 10^16
Variance: 288457099.4
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Common values
Value                  Count    Frequency (%)
2.019000296 × 10^11    30       < 0.1%
2.019000497 × 10^11    28       < 0.1%
2.019000271 × 10^11    25       < 0.1%
2.019000576 × 10^11    24       < 0.1%
2.019000522 × 10^11    24       < 0.1%
2.019000546 × 10^11    24       < 0.1%
2.019000414 × 10^11    23       < 0.1%
2.019000499 × 10^11    22       < 0.1%
2.019000429 × 10^11    21       < 0.1%
2.019000216 × 10^11    19       < 0.1%
Other values (57576)   129824   99.8%
Minimum 10 values
Value            Count    Frequency (%)
2.019 × 10^11    3        < 0.1%
2.019 × 10^11    1        < 0.1%
2.019 × 10^11    4        < 0.1%
2.019 × 10^11    4        < 0.1%
2.019 × 10^11    3        < 0.1%
2.019 × 10^11    2        < 0.1%
2.019 × 10^11    2        < 0.1%
2.019 × 10^11    1        < 0.1%
2.019 × 10^11    3        < 0.1%
2.019 × 10^11    2        < 0.1%
Maximum 10 values
Value                  Count    Frequency (%)
2.019000588 × 10^11    2        < 0.1%
2.019000588 × 10^11    1        < 0.1%
2.019000588 × 10^11    1        < 0.1%
2.019000588 × 10^11    3        < 0.1%
2.019000588 × 10^11    3        < 0.1%
2.019000588 × 10^11    4        < 0.1%
2.019000588 × 10^11    4        < 0.1%
2.019000588 × 10^11    1        < 0.1%
2.019000588 × 10^11    1        < 0.1%
2.019000588 × 10^11    1        < 0.1%
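
The descriptive statistics reported above for Num_Acc are related to one another: the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below recomputes both from the rounded values printed in the report, so agreement is only expected up to the displayed precision.

```python
# Values copied from the Num_Acc "Descriptive statistics" block.
mean = 2.019000295e11
std = 16984.02483

variance = std ** 2  # variance = standard deviation squared
cv = std / mean      # coefficient of variation = std / mean

print(f"variance ~ {variance:.1f}")
print(f"CV ~ {cv:.8e}")
```

The very small CV reflects that Num_Acc is an identifier whose values all share the same leading digits.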

id_vehicule
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 96650
Distinct (%): 74.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 138250880.3
Minimum: 137982129
Maximum: 138306525
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1016.2 KiB

Quantile statistics

Minimum: 137982129
5-th percentile: 138200945
Q1: 138222930.8
median: 138250979.5
Q3: 138278784.2
95-th percentile: 138300862.8
Maximum: 138306525
Range: 324396
Interquartile range (IQR): 55853.5

Descriptive statistics

Standard deviation: 32349.08443
Coefficient of variation (CV): 0.0002339882708
Kurtosis: -0.306917938
Mean: 138250880.3
Median Absolute Deviation (MAD): 27939
Skewness: -0.1088231655
Sum: 1.79814625 × 10^13
Variance: 1046463263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                   Count    Frequency (%)
138250566               30       < 0.1%
138212691               28       < 0.1%
138255393               25       < 0.1%
138212361               21       < 0.1%
138225436               20       < 0.1%
138266054               19       < 0.1%
138273542               18       < 0.1%
138220007               18       < 0.1%
138254819               18       < 0.1%
138232653               17       < 0.1%
Other values (96640)    129850   99.8%
Minimum 10 values
Value        Count    Frequency (%)
137982129    1        < 0.1%
137982130    1        < 0.1%
137982131    1        < 0.1%
137982132    1        < 0.1%
137982133    1        < 0.1%
137982134    1        < 0.1%
137982135    1        < 0.1%
137982136    1        < 0.1%
137982137    2        < 0.1%
137982138    3        < 0.1%
Maximum 10 values
Value        Count    Frequency (%)
138306525    1        < 0.1%
138306524    2        < 0.1%
138306523    1        < 0.1%
138306522    1        < 0.1%
138306521    1        < 0.1%
138306520    2        < 0.1%
138306519    1        < 0.1%
138306518    2        < 0.1%
138306517    1        < 0.1%
138306516    2        < 0.1%

num_veh
Categorical

HIGH CORRELATION

Distinct: 29
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value                Count
A01                  78630
B01                  43597
C01                  5328
D01                  1126
Z01                  793
Other values (24)    590

Length

Max length: 4
Median length: 3
Mean length: 3.000038443
Min length: 3

Characters and Unicode

Total characters: 390197
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
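
To illustrate the kind of character property the sentence above refers to, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is an assumption-laden illustration, not necessarily how the report computes its figures: `category_counts` is a hypothetical helper, the sample values only imitate the style of num_veh, and `unicodedata.category` returns two-letter codes (`Lu` for Uppercase Letter, `Nd` for Decimal Number, `Po` for Other Punctuation) rather than the long names shown in the report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample in the style of num_veh, not taken from the dataset.
sample = ["A01", "B01", "A01", "Z01"]
counts = category_counts(sample)
print(counts)
```

Script and block information is not exposed by `unicodedata`, so those two breakdowns would need a different source.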

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: B01
2nd row: B01
3rd row: A01
4th row: A01
5th row: A01

Common Values

Value                Count    Frequency (%)
A01                  78630    60.5%
B01                  43597    33.5%
C01                  5328     4.1%
D01                  1126     0.9%
Z01                  793      0.6%
E01                  272      0.2%
F01                  101      0.1%
Y01                  61       < 0.1%
G01                  42       < 0.1%
H01                  20       < 0.1%
Other values (19)    94       0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
a0178630
60.5%
b0143597
33.5%
c015328
 
4.1%
d011126
 
0.9%
z01793
 
0.6%
e01272
 
0.2%
f01101
 
0.1%
y0161
 
< 0.1%
g0142
 
< 0.1%
h0120
 
< 0.1%
Other values (19)94
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0130064
33.3%
1130064
33.3%
A78632
20.2%
B43599
 
11.2%
C5329
 
1.4%
D1126
 
0.3%
Z793
 
0.2%
E272
 
0.1%
F102
 
< 0.1%
Y61
 
< 0.1%
Other values (17)155
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260128
66.7%
Uppercase Letter130068
33.3%
Other Punctuation1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A78632
60.5%
B43599
33.5%
C5329
 
4.1%
D1126
 
0.9%
Z793
 
0.6%
E272
 
0.2%
F102
 
0.1%
Y61
 
< 0.1%
G42
 
< 0.1%
H20
 
< 0.1%
Other values (14)92
 
0.1%
Decimal Number
ValueCountFrequency (%)
0130064
50.0%
1130064
50.0%
Other Punctuation
ValueCountFrequency (%)
\1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common260129
66.7%
Latin130068
33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A78632
60.5%
B43599
33.5%
C5329
 
4.1%
D1126
 
0.9%
Z793
 
0.6%
E272
 
0.2%
F102
 
0.1%
Y61
 
< 0.1%
G42
 
< 0.1%
H20
 
< 0.1%
Other values (14)92
 
0.1%
Common
ValueCountFrequency (%)
0130064
50.0%
1130064
50.0%
\1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII390197
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0130064
33.3%
1130064
33.3%
A78632
20.2%
B43599
 
11.2%
C5329
 
1.4%
D1126
 
0.3%
Z793
 
0.2%
E272
 
0.1%
F102
 
< 0.1%
Y61
 
< 0.1%
Other values (17)155
 
< 0.1%

place
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1        94966
2        15230
10       10779
Other    9089

Length

Max length: 5
Median length: 1
Mean length: 1.362398512
Min length: 1

Characters and Unicode

Total characters: 177199
Distinct characters: 8
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value    Count    Frequency (%)
1        94966    73.0%
2        15230    11.7%
10       10779    8.3%
Other    9089     7.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
194966
73.0%
215230
 
11.7%
1010779
 
8.3%
other9089
 
7.0%

Most occurring characters

ValueCountFrequency (%)
1105745
59.7%
215230
 
8.6%
010779
 
6.1%
O9089
 
5.1%
t9089
 
5.1%
h9089
 
5.1%
e9089
 
5.1%
r9089
 
5.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131754
74.4%
Lowercase Letter36356
 
20.5%
Uppercase Letter9089
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t9089
25.0%
h9089
25.0%
e9089
25.0%
r9089
25.0%
Decimal Number
ValueCountFrequency (%)
1105745
80.3%
215230
 
11.6%
010779
 
8.2%
Uppercase Letter
ValueCountFrequency (%)
O9089
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common131754
74.4%
Latin45445
 
25.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
O9089
20.0%
t9089
20.0%
h9089
20.0%
e9089
20.0%
r9089
20.0%
Common
ValueCountFrequency (%)
1105745
80.3%
215230
 
11.6%
010779
 
8.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII177199
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1105745
59.7%
215230
 
8.6%
010779
 
6.1%
O9089
 
5.1%
t9089
 
5.1%
h9089
 
5.1%
e9089
 
5.1%
r9089
 
5.1%

catu
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1        95381
2        23904
3        10779

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 130064
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value    Count    Frequency (%)
1        95381    73.3%
2        23904    18.4%
3        10779    8.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
195381
73.3%
223904
 
18.4%
310779
 
8.3%

Most occurring characters

ValueCountFrequency (%)
195381
73.3%
223904
 
18.4%
310779
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130064
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
195381
73.3%
223904
 
18.4%
310779
 
8.3%

Most occurring scripts

ValueCountFrequency (%)
Common130064
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
195381
73.3%
223904
 
18.4%
310779
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII130064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
195381
73.3%
223904
 
18.4%
310779
 
8.3%

grav
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1        54057
4        52062
3        20497
2        3448

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 130064
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 1
4th row: 4
5th row: 1

Common Values

Value    Count    Frequency (%)
1        54057    41.6%
4        52062    40.0%
3        20497    15.8%
2        3448     2.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
154057
41.6%
452062
40.0%
320497
 
15.8%
23448
 
2.7%

Most occurring characters

ValueCountFrequency (%)
154057
41.6%
452062
40.0%
320497
 
15.8%
23448
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130064
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
154057
41.6%
452062
40.0%
320497
 
15.8%
23448
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
Common130064
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
154057
41.6%
452062
40.0%
320497
 
15.8%
23448
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII130064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
154057
41.6%
452062
40.0%
320497
 
15.8%
23448
 
2.7%

sexe
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1        88398
2        41666

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 130064
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value    Count    Frequency (%)
1        88398    68.0%
2        41666    32.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
188398
68.0%
241666
32.0%

Most occurring characters

ValueCountFrequency (%)
188398
68.0%
241666
32.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130064
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
188398
68.0%
241666
32.0%

Most occurring scripts

ValueCountFrequency (%)
Common130064
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
188398
68.0%
241666
32.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII130064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
188398
68.0%
241666
32.0%

secu1
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1.0      79141
2.0      23929
8.0      18823
0.0      6719
Other    1452

Length

Max length: 5
Median length: 3
Mean length: 3.02232747
Min length: 3

Characters and Unicode

Total characters: 393096
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value    Count    Frequency (%)
1.0      79141    60.8%
2.0      23929    18.4%
8.0      18823    14.5%
0.0      6719     5.2%
Other    1452     1.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.079141
60.8%
2.023929
 
18.4%
8.018823
 
14.5%
0.06719
 
5.2%
other1452
 
1.1%

Most occurring characters

ValueCountFrequency (%)
0135331
34.4%
.128612
32.7%
179141
20.1%
223929
 
6.1%
818823
 
4.8%
O1452
 
0.4%
t1452
 
0.4%
h1452
 
0.4%
e1452
 
0.4%
r1452
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number257224
65.4%
Other Punctuation128612
32.7%
Lowercase Letter5808
 
1.5%
Uppercase Letter1452
 
0.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0135331
52.6%
179141
30.8%
223929
 
9.3%
818823
 
7.3%
Lowercase Letter
ValueCountFrequency (%)
t1452
25.0%
h1452
25.0%
e1452
25.0%
r1452
25.0%
Other Punctuation
ValueCountFrequency (%)
.128612
100.0%
Uppercase Letter
ValueCountFrequency (%)
O1452
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common385836
98.2%
Latin7260
 
1.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0135331
35.1%
.128612
33.3%
179141
20.5%
223929
 
6.2%
818823
 
4.9%
Latin
ValueCountFrequency (%)
O1452
20.0%
t1452
20.0%
h1452
20.0%
e1452
20.0%
r1452
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII393096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0135331
34.4%
.128612
32.7%
179141
20.1%
223929
 
6.1%
818823
 
4.8%
O1452
 
0.4%
t1452
 
0.4%
h1452
 
0.4%
e1452
 
0.4%
r1452
 
0.4%

senc
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1.0      57832
2.0      43681
3.0      19210
0.0      9341

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 390192
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 2.0
3rd row: 2.0
4th row: 1.0
5th row: 1.0

Common Values

Value    Count    Frequency (%)
1.0      57832    44.5%
2.0      43681    33.6%
3.0      19210    14.8%
0.0      9341     7.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.057832
44.5%
2.043681
33.6%
3.019210
 
14.8%
0.09341
 
7.2%

Most occurring characters

ValueCountFrequency (%)
0139405
35.7%
.130064
33.3%
157832
14.8%
243681
 
11.2%
319210
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260128
66.7%
Other Punctuation130064
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0139405
53.6%
157832
22.2%
243681
 
16.8%
319210
 
7.4%
Other Punctuation
ValueCountFrequency (%)
.130064
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common390192
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0139405
35.7%
.130064
33.3%
157832
14.8%
243681
 
11.2%
319210
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII390192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0139405
35.7%
.130064
33.3%
157832
14.8%
243681
 
11.2%
319210
 
4.9%

catv
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
7        83492
Other    28906
33       9090
10       8576

Length

Max length: 5
Median length: 1
Mean length: 2.024803174
Min length: 1

Characters and Unicode

Total characters: 263354
Distinct characters: 9
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7
2nd row: 7
3rd row: Other
4th row: 7
5th row: 7

Common Values

Value    Count    Frequency (%)
7        83492    64.2%
Other    28906    22.2%
33       9090     7.0%
10       8576     6.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
783492
64.2%
other28906
 
22.2%
339090
 
7.0%
108576
 
6.6%

Most occurring characters

ValueCountFrequency (%)
783492
31.7%
O28906
 
11.0%
t28906
 
11.0%
h28906
 
11.0%
e28906
 
11.0%
r28906
 
11.0%
318180
 
6.9%
18576
 
3.3%
08576
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number118824
45.1%
Lowercase Letter115624
43.9%
Uppercase Letter28906
 
11.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
783492
70.3%
318180
 
15.3%
18576
 
7.2%
08576
 
7.2%
Lowercase Letter
ValueCountFrequency (%)
t28906
25.0%
h28906
25.0%
e28906
25.0%
r28906
25.0%
Uppercase Letter
ValueCountFrequency (%)
O28906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin144530
54.9%
Common118824
45.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
O28906
20.0%
t28906
20.0%
h28906
20.0%
e28906
20.0%
r28906
20.0%
Common
ValueCountFrequency (%)
783492
70.3%
318180
 
15.3%
18576
 
7.2%
08576
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII263354
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
783492
31.7%
O28906
 
11.0%
t28906
 
11.0%
h28906
 
11.0%
e28906
 
11.0%
r28906
 
11.0%
318180
 
6.9%
18576
 
3.3%
08576
 
3.3%

obs
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
0.0      109267
Other    20797

Length

Max length: 5
Median length: 3
Mean length: 3.319796408
Min length: 3

Characters and Unicode

Total characters: 431786
Distinct characters: 7
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: Other
4th row: Other
5th row: 0.0

Common Values

Value    Count     Frequency (%)
0.0      109267    84.0%
Other    20797     16.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
0.0109267
84.0%
other20797
 
16.0%

Most occurring characters

ValueCountFrequency (%)
0218534
50.6%
.109267
25.3%
O20797
 
4.8%
t20797
 
4.8%
h20797
 
4.8%
e20797
 
4.8%
r20797
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number218534
50.6%
Other Punctuation109267
25.3%
Lowercase Letter83188
 
19.3%
Uppercase Letter20797
 
4.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t20797
25.0%
h20797
25.0%
e20797
25.0%
r20797
25.0%
Decimal Number
ValueCountFrequency (%)
0218534
100.0%
Other Punctuation
ValueCountFrequency (%)
.109267
100.0%
Uppercase Letter
ValueCountFrequency (%)
O20797
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common327801
75.9%
Latin103985
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
O20797
20.0%
t20797
20.0%
h20797
20.0%
e20797
20.0%
r20797
20.0%
Common
ValueCountFrequency (%)
0218534
66.7%
.109267
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII431786
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0218534
50.6%
.109267
25.3%
O20797
 
4.8%
t20797
 
4.8%
h20797
 
4.8%
e20797
 
4.8%
r20797
 
4.8%

obsm
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
2.0      83443
0.0      24752
1.0      19920
Other    1949

Length

Max length: 5
Median length: 3
Mean length: 3.029969861
Min length: 3

Characters and Unicode

Total characters: 394090
Distinct characters: 9
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 2.0
3rd row: 0.0
4th row: 0.0
5th row: 2.0

Common Values

Value    Count    Frequency (%)
2.0      83443    64.2%
0.0      24752    19.0%
1.0      19920    15.3%
Other    1949     1.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
2.083443
64.2%
0.024752
 
19.0%
1.019920
 
15.3%
other1949
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0152867
38.8%
.128115
32.5%
283443
21.2%
119920
 
5.1%
O1949
 
0.5%
t1949
 
0.5%
h1949
 
0.5%
e1949
 
0.5%
r1949
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number256230
65.0%
Other Punctuation128115
32.5%
Lowercase Letter7796
 
2.0%
Uppercase Letter1949
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t1949
25.0%
h1949
25.0%
e1949
25.0%
r1949
25.0%
Decimal Number
ValueCountFrequency (%)
0152867
59.7%
283443
32.6%
119920
 
7.8%
Other Punctuation
ValueCountFrequency (%)
.128115
100.0%
Uppercase Letter
ValueCountFrequency (%)
O1949
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common384345
97.5%
Latin9745
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
O1949
20.0%
t1949
20.0%
h1949
20.0%
e1949
20.0%
r1949
20.0%
Common
ValueCountFrequency (%)
0152867
39.8%
.128115
33.3%
283443
21.7%
119920
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII394090
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0152867
38.8%
.128115
32.5%
283443
21.2%
119920
 
5.1%
O1949
 
0.5%
t1949
 
0.5%
h1949
 
0.5%
e1949
 
0.5%
r1949
 
0.5%

choc
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1.0      47042
Other    33004
3.0      19767
2.0      16899
4.0      13352

Length

Max length: 5
Median length: 3
Mean length: 3.507503998
Min length: 3

Characters and Unicode

Total characters: 456200
Distinct characters: 11
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Other
2nd row: Other
3rd row: 3.0
4th row: 1.0
5th row: 1.0

Common Values

Value    Count    Frequency (%)
1.0      47042    36.2%
Other    33004    25.4%
3.0      19767    15.2%
2.0      16899    13.0%
4.0      13352    10.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.047042
36.2%
other33004
25.4%
3.019767
15.2%
2.016899
 
13.0%
4.013352
 
10.3%

Most occurring characters

ValueCountFrequency (%)
.97060
21.3%
097060
21.3%
147042
10.3%
O33004
 
7.2%
t33004
 
7.2%
h33004
 
7.2%
e33004
 
7.2%
r33004
 
7.2%
319767
 
4.3%
216899
 
3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number194120
42.6%
Lowercase Letter132016
28.9%
Other Punctuation97060
21.3%
Uppercase Letter33004
 
7.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
097060
50.0%
147042
24.2%
319767
 
10.2%
216899
 
8.7%
413352
 
6.9%
Lowercase Letter
ValueCountFrequency (%)
t33004
25.0%
h33004
25.0%
e33004
25.0%
r33004
25.0%
Uppercase Letter
ValueCountFrequency (%)
O33004
100.0%
Other Punctuation
ValueCountFrequency (%)
.97060
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common291180
63.8%
Latin165020
36.2%

Most frequent character per script

Common
ValueCountFrequency (%)
.97060
33.3%
097060
33.3%
147042
16.2%
319767
 
6.8%
216899
 
5.8%
413352
 
4.6%
Latin
ValueCountFrequency (%)
O33004
20.0%
t33004
20.0%
h33004
20.0%
e33004
20.0%
r33004
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII456200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.97060
21.3%
097060
21.3%
147042
10.3%
O33004
 
7.2%
t33004
 
7.2%
h33004
 
7.2%
e33004
 
7.2%
r33004
 
7.2%
319767
 
4.3%
216899
 
3.7%

manv
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1016.2 KiB

Value    Count
1.0      54636
Other    50681
2.0      14566
15.0     10181

Length

Max length: 5
Median length: 3
Mean length: 3.857600873
Min length: 3

Characters and Unicode

Total characters: 501735
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Other
2nd row: Other
3rd row: Other
4th row: Other
5th row: 2.0

Common Values

Value    Count    Frequency (%)
1.0      54636    42.0%
Other    50681    39.0%
2.0      14566    11.2%
15.0     10181    7.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.054636
42.0%
other50681
39.0%
2.014566
 
11.2%
15.010181
 
7.8%

Most occurring characters

ValueCountFrequency (%)
.79383
15.8%
079383
15.8%
164817
12.9%
O50681
10.1%
t50681
10.1%
h50681
10.1%
e50681
10.1%
r50681
10.1%
214566
 
2.9%
510181
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter202724
40.4%
Decimal Number168947
33.7%
Other Punctuation79383
 
15.8%
Uppercase Letter50681
 
10.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t50681
25.0%
h50681
25.0%
e50681
25.0%
r50681
25.0%
Decimal Number
ValueCountFrequency (%)
079383
47.0%
164817
38.4%
214566
 
8.6%
510181
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
O50681
100.0%
Other Punctuation
ValueCountFrequency (%)
.79383
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin253405
50.5%
Common248330
49.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
O50681
20.0%
t50681
20.0%
h50681
20.0%
e50681
20.0%
r50681
20.0%
Common
ValueCountFrequency (%)
.79383
32.0%
079383
32.0%
164817
26.1%
214566
 
5.9%
510181
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII501735
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.79383
15.8%
079383
15.8%
164817
12.9%
O50681
10.1%
t50681
10.1%
h50681
10.1%
e50681
10.1%
r50681
10.1%
214566
 
2.9%
510181
 
2.0%

mois
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.683917148
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1016.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.381499601
Coefficient of variation (CV): 0.5059158464
Kurtosis: -1.158325996
Mean: 6.683917148
Median Absolute Deviation (MAD): 3
Skewness: -0.06476184856
Sum: 869337
Variance: 11.43453955
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
712282
9.4%
612183
9.4%
1011762
9.0%
911602
8.9%
1211132
8.6%
1110826
8.3%
510639
8.2%
310317
7.9%
810313
7.9%
410150
7.8%
Other values (2)18858
14.5%
ValueCountFrequency (%)
19409
7.2%
29449
7.3%
310317
7.9%
410150
7.8%
510639
8.2%
612183
9.4%
712282
9.4%
810313
7.9%
911602
8.9%
1011762
9.0%
ValueCountFrequency (%)
1211132
8.6%
1110826
8.3%
1011762
9.0%
911602
8.9%
810313
7.9%
712282
9.4%
612183
9.4%
510639
8.2%
410150
7.8%
310317
7.9%

lum
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
1
86683 
5
20575 
3
13524 
Other
9282 

Length

Max length5
Median length1
Mean length1.285459466
Min length1

Characters and Unicode

Total characters167192
Distinct characters8
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
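
As an illustration of how such character statistics can be obtained, here is a small sketch using Python's standard `unicodedata` module. It assumes the value counts listed for lum in this report; the helper name `category_totals` is hypothetical, and the report generator may compute its figures differently.

```python
import unicodedata
from collections import Counter

# Value counts copied from the report's table for `lum`.
lum_counts = {"1": 86683, "5": 20575, "3": 13524, "Other": 9282}

def category_totals(value_counts):
    """Count characters per Unicode general category, weighted by row count."""
    totals = Counter()
    for value, n_rows in value_counts.items():
        for ch in value:
            # e.g. "Nd" = Decimal Number, "Ll" = Lowercase Letter, "Lu" = Uppercase Letter
            totals[unicodedata.category(ch)] += n_rows
    return totals

totals = category_totals(lum_counts)
total_chars = sum(totals.values())
print(dict(totals), total_chars)
```

The three categories found this way (Nd, Ll, Lu) correspond to the "Decimal Number", "Lowercase Letter" and "Uppercase Letter" rows of the report.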

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOther
2nd rowOther
3rd rowOther
4th row3
5th row1

Common Values

Value   Count   Frequency (%)
1       86683   66.6%
5       20575   15.8%
3       13524   10.4%
Other   9282    7.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
186683
66.6%
520575
 
15.8%
313524
 
10.4%
other9282
 
7.1%

Most occurring characters

ValueCountFrequency (%)
186683
51.8%
520575
 
12.3%
313524
 
8.1%
O9282
 
5.6%
t9282
 
5.6%
h9282
 
5.6%
e9282
 
5.6%
r9282
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number120782
72.2%
Lowercase Letter37128
 
22.2%
Uppercase Letter9282
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t9282
25.0%
h9282
25.0%
e9282
25.0%
r9282
25.0%
Decimal Number
ValueCountFrequency (%)
186683
71.8%
520575
 
17.0%
313524
 
11.2%
Uppercase Letter
ValueCountFrequency (%)
O9282
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common120782
72.2%
Latin46410
 
27.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
O9282
20.0%
t9282
20.0%
h9282
20.0%
e9282
20.0%
r9282
20.0%
Common
ValueCountFrequency (%)
186683
71.8%
520575
 
17.0%
313524
 
11.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII167192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
186683
51.8%
520575
 
12.3%
313524
 
8.1%
O9282
 
5.6%
t9282
 
5.6%
h9282
 
5.6%
e9282
 
5.6%
r9282
 
5.6%

dep
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
Other
91278 
Ile de france
27425 
Paris
11361 

Length

Max length13
Median length5
Mean length6.686861853
Min length5

Characters and Unicode

Total characters869720
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowIle de france
2nd rowIle de france
3rd rowIle de france
4th rowIle de france
5th rowIle de france

Common Values

Value           Count   Frequency (%)
Other           91278   70.2%
Ile de france   27425   21.1%
Paris           11361   8.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
other91278
49.4%
france27425
 
14.8%
de27425
 
14.8%
ile27425
 
14.8%
paris11361
 
6.1%

Most occurring characters

ValueCountFrequency (%)
e173553
20.0%
r130064
15.0%
O91278
10.5%
t91278
10.5%
h91278
10.5%
54850
 
6.3%
a38786
 
4.5%
I27425
 
3.2%
l27425
 
3.2%
d27425
 
3.2%
Other values (6)116358
13.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter684806
78.7%
Uppercase Letter130064
 
15.0%
Space Separator54850
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e173553
25.3%
r130064
19.0%
t91278
13.3%
h91278
13.3%
a38786
 
5.7%
l27425
 
4.0%
d27425
 
4.0%
f27425
 
4.0%
n27425
 
4.0%
c27425
 
4.0%
Other values (2)22722
 
3.3%
Uppercase Letter
ValueCountFrequency (%)
O91278
70.2%
I27425
 
21.1%
P11361
 
8.7%
Space Separator
ValueCountFrequency (%)
54850
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin814870
93.7%
Common54850
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e173553
21.3%
r130064
16.0%
O91278
11.2%
t91278
11.2%
h91278
11.2%
a38786
 
4.8%
I27425
 
3.4%
l27425
 
3.4%
d27425
 
3.4%
f27425
 
3.4%
Other values (5)88933
10.9%
Common
ValueCountFrequency (%)
54850
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII869720
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e173553
20.0%
r130064
15.0%
O91278
10.5%
t91278
10.5%
h91278
10.5%
54850
 
6.3%
a38786
 
4.5%
I27425
 
3.2%
l27425
 
3.2%
d27425
 
3.2%
Other values (6)116358
13.4%

agg
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
2
80534 
1
49530 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters130064
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

Value   Count   Frequency (%)
2       80534   61.9%
1       49530   38.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
280534
61.9%
149530
38.1%

Most occurring characters

ValueCountFrequency (%)
280534
61.9%
149530
38.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number130064
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
280534
61.9%
149530
38.1%

Most occurring scripts

ValueCountFrequency (%)
Common130064
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
280534
61.9%
149530
38.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII130064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
280534
61.9%
149530
38.1%

int
Real number (ℝ≥0)

Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.005166687
Minimum1
Maximum9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q32
95-th percentile7
Maximum9
Range8
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.972474854
Coefficient of variation (CV)0.9836962019
Kurtosis5.145903927
Mean2.005166687
Median Absolute Deviation (MAD)0
Skewness2.409376989
Sum260800
Variance3.890657051
MonotonicityNot monotonic
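
Because int takes only nine distinct values, the summary figures above can be recomputed from its frequency table alone. The sketch below assumes the per-value counts listed for int in this report and assumes the variance is the sample variance (divisor n - 1), which is the convention that reproduces the reported figure; the variable names are illustrative.

```python
import math

# Per-value counts for `int`, copied from the report (value -> count).
int_counts = {1: 85566, 2: 16071, 3: 13382, 4: 2743, 5: 651,
              6: 4389, 7: 1421, 8: 131, 9: 5710}

n = sum(int_counts.values())                      # number of observations
total = sum(v * c for v, c in int_counts.items()) # sum of all values
mean = total / n

# Sample variance (divisor n - 1), weighting each squared deviation by its count.
variance = sum(c * (v - mean) ** 2 for v, c in int_counts.items()) / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 6), round(variance, 6), round(std, 6))
```

The recomputed observation count, sum, mean, variance and standard deviation agree with the reported values.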
Histogram with fixed size bins (bins=9)
Value   Count   Frequency (%)
1       85566   65.8%
2       16071   12.4%
3       13382   10.3%
9       5710    4.4%
6       4389    3.4%
4       2743    2.1%
7       1421    1.1%
5       651     0.5%
8       131     0.1%
ValueCountFrequency (%)
185566
65.8%
216071
 
12.4%
313382
 
10.3%
42743
 
2.1%
5651
 
0.5%
64389
 
3.4%
71421
 
1.1%
8131
 
0.1%
95710
 
4.4%
ValueCountFrequency (%)
95710
 
4.4%
8131
 
0.1%
71421
 
1.1%
64389
 
3.4%
5651
 
0.5%
42743
 
2.1%
313382
 
10.3%
216071
 
12.4%
185566
65.8%

atm
Real number (ℝ≥0)

HIGH CORRELATION

Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.612306249
Minimum1
Maximum9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile7
Maximum9
Range8
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.669200661
Coefficient of variation (CV)1.035287596
Kurtosis8.961533638
Mean1.612306249
Median Absolute Deviation (MAD)0
Skewness3.167991263
Sum209703
Variance2.786230847
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)
Value   Count    Frequency (%)
1       103079   79.3%
2       14413    11.1%
8       4805     3.7%
3       3159     2.4%
7       2292     1.8%
5       720      0.6%
9       629      0.5%
4       613      0.5%
6       354      0.3%
ValueCountFrequency (%)
1103079
79.3%
214413
 
11.1%
33159
 
2.4%
4613
 
0.5%
5720
 
0.6%
6354
 
0.3%
72292
 
1.8%
84805
 
3.7%
9629
 
0.5%
ValueCountFrequency (%)
9629
 
0.5%
84805
 
3.7%
72292
 
1.8%
6354
 
0.3%
5720
 
0.6%
4613
 
0.5%
33159
 
2.4%
214413
 
11.1%
1103079
79.3%

col
Real number (ℝ≥0)

HIGH CORRELATION

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.881858162
Minimum1
Maximum7
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median3
Q36
95-th percentile7
Maximum7
Range6
Interquartile range (IQR)3

Descriptive statistics

Standard deviation1.863094927
Coefficient of variation (CV)0.4799492534
Kurtosis-1.283105568
Mean3.881858162
Median Absolute Deviation (MAD)1
Skewness0.1360574179
Sum504890
Variance3.471122706
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value   Count   Frequency (%)
3       39449   30.3%
6       33711   25.9%
2       17649   13.6%
1       13785   10.6%
4       9160    7.0%
7       8502    6.5%
5       7808    6.0%
ValueCountFrequency (%)
113785
 
10.6%
217649
13.6%
339449
30.3%
49160
 
7.0%
57808
 
6.0%
633711
25.9%
78502
 
6.5%
ValueCountFrequency (%)
78502
 
6.5%
633711
25.9%
57808
 
6.0%
49160
 
7.0%
339449
30.3%
217649
13.6%
113785
 
10.6%

lat
Categorical

HIGH CARDINALITY
UNIFORM

Distinct54505
Distinct (%)41.9%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
48,8100000
 
48
43,1213200
 
47
48,7900000
 
45
48,9100000
 
33
43,1334900
 
33
Other values (54500)
129858 

Length

Max length12
Median length10
Mean length10.04483178
Min length9

Characters and Unicode

Total characters1306471
Distinct characters13
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10555 ?
Unique (%)8.1%

Sample

1st row48,8962100
2nd row48,8962100
3rd row48,8962100
4th row48,9307000
5th row48,9358718
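
The sample rows show latitudes stored as text with a decimal comma, which is why lat is reported as Categorical rather than numeric. A minimal sketch of converting such strings to floats follows; the helper name `parse_decimal_comma` is hypothetical, and stripping whitespace is an assumption based on the space characters counted in the character statistics for this variable.

```python
def parse_decimal_comma(text: str) -> float:
    """Convert a string such as "48,8962100" to the float 48.89621."""
    # Assumption: any whitespace inside the value is noise and can be dropped.
    cleaned = "".join(text.split())
    return float(cleaned.replace(",", "."))

# Values taken from the report's sample rows and value tables.
samples = ["48,8962100", "48,9307000", "15,9935700", "-61,7250900"]
parsed = [parse_decimal_comma(s) for s in samples]
print(parsed)
```

When the data is loaded with pandas, passing `decimal=","` to `pandas.read_csv` is another way to get numeric columns directly, provided the source file is a CSV.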

Common Values

Value                  Count    Frequency (%)
48,8100000             48       < 0.1%
43,1213200             47       < 0.1%
48,7900000             45       < 0.1%
48,9100000             33       < 0.1%
43,1334900             33       < 0.1%
43,1578850             30       < 0.1%
49,8784070             28       < 0.1%
43,1434000             28       < 0.1%
48,8200000             28       < 0.1%
15,9935700             25       < 0.1%
Other values (54495)   129719   99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
48,810000048
 
< 0.1%
43,121320047
 
< 0.1%
48,790000045
 
< 0.1%
43,133490033
 
< 0.1%
48,910000033
 
< 0.1%
43,157885030
 
< 0.1%
48,820000028
 
< 0.1%
49,878407028
 
< 0.1%
43,143400028
 
< 0.1%
15,993570025
 
< 0.1%
Other values (54495)129719
99.7%

Most occurring characters

ValueCountFrequency (%)
0249197
19.1%
4198217
15.2%
8138766
10.6%
,130064
10.0%
390964
 
7.0%
788350
 
6.8%
987001
 
6.7%
586854
 
6.6%
681842
 
6.3%
276568
 
5.9%
Other values (3)78648
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1169641
89.5%
Other Punctuation130064
 
10.0%
Space Separator3383
 
0.3%
Dash Punctuation3383
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0249197
21.3%
4198217
16.9%
8138766
11.9%
390964
 
7.8%
788350
 
7.6%
987001
 
7.4%
586854
 
7.4%
681842
 
7.0%
276568
 
6.5%
171882
 
6.1%
Other Punctuation
ValueCountFrequency (%)
,130064
100.0%
Space Separator
ValueCountFrequency (%)
3383
100.0%
Dash Punctuation
ValueCountFrequency (%)
-3383
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1306471
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0249197
19.1%
4198217
15.2%
8138766
10.6%
,130064
10.0%
390964
 
7.0%
788350
 
6.8%
987001
 
6.7%
586854
 
6.6%
681842
 
6.3%
276568
 
5.9%
Other values (3)78648
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1306471
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0249197
19.1%
4198217
15.2%
8138766
10.6%
,130064
10.0%
390964
 
7.0%
788350
 
6.8%
987001
 
6.7%
586854
 
6.6%
681842
 
6.3%
276568
 
5.9%
Other values (3)78648
 
6.0%

long
Categorical

HIGH CARDINALITY
UNIFORM

Distinct55046
Distinct (%)42.3%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
5,9533100
 
47
2,4900000
 
35
5,9847600
 
33
2,8563670
 
30
2,4400000
 
30
Other values (55041)
129889 

Length

Max length13
Median length9
Mean length9.398134764
Min length9

Characters and Unicode

Total characters1222359
Distinct characters13
Distinct categories4 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10695 ?
Unique (%)8.2%

Sample

1st row2,4701200
2nd row2,4701200
3rd row2,4701200
4th row2,3688000
5th row2,3191744

Common Values

Value                  Count    Frequency (%)
5,9533100              47       < 0.1%
2,4900000              35       < 0.1%
5,9847600              33       < 0.1%
2,8563670              30       < 0.1%
2,4400000              30       < 0.1%
2,8373390              28       < 0.1%
6,0142000              28       < 0.1%
2,3467600              27       < 0.1%
5,4000000              26       < 0.1%
-61,7250900            25       < 0.1%
Other values (55036)   129755   99.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
5,953310047
 
< 0.1%
2,490000035
 
< 0.1%
5,984760033
 
< 0.1%
2,856367030
 
< 0.1%
2,440000030
 
< 0.1%
2,837339028
 
< 0.1%
6,014200028
 
< 0.1%
2,346760027
 
< 0.1%
5,400000026
 
< 0.1%
61,725090025
 
< 0.1%
Other values (54923)129755
99.8%

Most occurring characters

ValueCountFrequency (%)
0262385
21.5%
,130064
10.6%
2122514
10.0%
495015
 
7.8%
393663
 
7.7%
590265
 
7.4%
189245
 
7.3%
679920
 
6.5%
773837
 
6.0%
872078
 
5.9%
Other values (3)113373
9.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1048159
85.7%
Other Punctuation130064
 
10.6%
Space Separator22068
 
1.8%
Dash Punctuation22068
 
1.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0262385
25.0%
2122514
11.7%
495015
 
9.1%
393663
 
8.9%
590265
 
8.6%
189245
 
8.5%
679920
 
7.6%
773837
 
7.0%
872078
 
6.9%
969237
 
6.6%
Other Punctuation
ValueCountFrequency (%)
,130064
100.0%
Space Separator
ValueCountFrequency (%)
22068
100.0%
Dash Punctuation
ValueCountFrequency (%)
-22068
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1222359
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0262385
21.5%
,130064
10.6%
2122514
10.0%
495015
 
7.8%
393663
 
7.7%
590265
 
7.4%
189245
 
7.3%
679920
 
6.5%
773837
 
6.0%
872078
 
5.9%
Other values (3)113373
9.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1222359
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0262385
21.5%
,130064
10.6%
2122514
10.0%
495015
 
7.8%
393663
 
7.7%
590265
 
7.4%
189245
 
7.3%
679920
 
6.5%
773837
 
6.0%
872078
 
5.9%
Other values (3)113373
9.3%

catr
Real number (ℝ≥0)

HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.257188769
Minimum1
Maximum9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q13
median3
Q34
95-th percentile4
Maximum9
Range8
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.277307845
Coefficient of variation (CV)0.3921503897
Kurtosis3.807024033
Mean3.257188769
Median Absolute Deviation (MAD)1
Skewness0.7871381834
Sum423643
Variance1.631515331
MonotonicityNot monotonic
Histogram with fixed size bins (bins=8)
Value   Count   Frequency (%)
4       53658   41.3%
3       46471   35.7%
1       15686   12.1%
2       9485    7.3%
7       2895    2.2%
9       1205    0.9%
6       512     0.4%
5       152     0.1%
ValueCountFrequency (%)
115686
 
12.1%
29485
 
7.3%
346471
35.7%
453658
41.3%
5152
 
0.1%
6512
 
0.4%
72895
 
2.2%
91205
 
0.9%
ValueCountFrequency (%)
91205
 
0.9%
72895
 
2.2%
6512
 
0.4%
5152
 
0.1%
453658
41.3%
346471
35.7%
29485
 
7.3%
115686
 
12.1%

circ
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
2.0
85498 
1.0
22635 
3.0
21118 
4.0
 
813

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters390192
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3.0
2nd row3.0
3rd row3.0
4th row1.0
5th row3.0
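
The labels of circ ("1.0", "2.0", ...) look like integer codes that were rendered as floating-point text. A minimal sketch of normalising them to integer codes follows; the helper name `to_int_code` is hypothetical, and treating the labels as integer codes is an assumption based only on the values shown in this report.

```python
def to_int_code(label: str) -> int:
    """Turn a label such as "2.0" into the integer code 2."""
    number = float(label)
    if not number.is_integer():
        # Refuse to silently truncate values such as "2.5".
        raise ValueError(f"not an integer-valued label: {label!r}")
    return int(number)

# Counts copied from the report's table for `circ`.
circ_counts = {"2.0": 85498, "1.0": 22635, "3.0": 21118, "4.0": 813}
circ_int_counts = {to_int_code(label): n for label, n in circ_counts.items()}
print(circ_int_counts)
```

The same conversion would apply to vosp, prof and plan, whose labels follow the same pattern.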

Common Values

Value   Count   Frequency (%)
2.0     85498   65.7%
1.0     22635   17.4%
3.0     21118   16.2%
4.0     813     0.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
2.085498
65.7%
1.022635
 
17.4%
3.021118
 
16.2%
4.0813
 
0.6%

Most occurring characters

ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
285498
21.9%
122635
 
5.8%
321118
 
5.4%
4813
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260128
66.7%
Other Punctuation130064
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0130064
50.0%
285498
32.9%
122635
 
8.7%
321118
 
8.1%
4813
 
0.3%
Other Punctuation
ValueCountFrequency (%)
.130064
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common390192
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
285498
21.9%
122635
 
5.8%
321118
 
5.4%
4813
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII390192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
285498
21.9%
122635
 
5.8%
321118
 
5.4%
4813
 
0.2%

nbv
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct13
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.51410075
Minimum0
Maximum12
Zeros3427
Zeros (%)2.6%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum0
5-th percentile1
Q12
median2
Q33
95-th percentile6
Maximum12
Range12
Interquartile range (IQR)1

Descriptive statistics

Standard deviation1.446391638
Coefficient of variation (CV)0.5753117244
Kurtosis6.204542419
Mean2.51410075
Median Absolute Deviation (MAD)0
Skewness2.014683938
Sum326994
Variance2.09204877
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
Value              Count   Frequency (%)
2                  78082   60.0%
4                  16816   12.9%
1                  11778   9.1%
3                  10641   8.2%
6                  4017    3.1%
0                  3427    2.6%
5                  2599    2.0%
8                  1326    1.0%
7                  527     0.4%
10                 407     0.3%
Other values (3)   444     0.3%
ValueCountFrequency (%)
03427
 
2.6%
111778
 
9.1%
278082
60.0%
310641
 
8.2%
416816
 
12.9%
52599
 
2.0%
64017
 
3.1%
7527
 
0.4%
81326
 
1.0%
9279
 
0.2%
ValueCountFrequency (%)
1275
 
0.1%
1190
 
0.1%
10407
 
0.3%
9279
 
0.2%
81326
 
1.0%
7527
 
0.4%
64017
 
3.1%
52599
 
2.0%
416816
12.9%
310641
8.2%

vosp
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
0.0
120392 
1.0
 
4035
3.0
 
3762
2.0
 
1875

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters390192
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

Value   Count    Frequency (%)
0.0     120392   92.6%
1.0     4035     3.1%
3.0     3762     2.9%
2.0     1875     1.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
0.0120392
92.6%
1.04035
 
3.1%
3.03762
 
2.9%
2.01875
 
1.4%

Most occurring characters

ValueCountFrequency (%)
0250456
64.2%
.130064
33.3%
14035
 
1.0%
33762
 
1.0%
21875
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260128
66.7%
Other Punctuation130064
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0250456
96.3%
14035
 
1.6%
33762
 
1.4%
21875
 
0.7%
Other Punctuation
ValueCountFrequency (%)
.130064
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common390192
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0250456
64.2%
.130064
33.3%
14035
 
1.0%
33762
 
1.0%
21875
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII390192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0250456
64.2%
.130064
33.3%
14035
 
1.0%
33762
 
1.0%
21875
 
0.5%

prof
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
1.0
104896 
2.0
20803 
3.0
 
2312
4.0
 
2053

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters390192
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row4.0
5th row1.0

Common Values

Value   Count    Frequency (%)
1.0     104896   80.6%
2.0     20803    16.0%
3.0     2312     1.8%
4.0     2053     1.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.0104896
80.6%
2.020803
 
16.0%
3.02312
 
1.8%
4.02053
 
1.6%

Most occurring characters

ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
1104896
26.9%
220803
 
5.3%
32312
 
0.6%
42053
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260128
66.7%
Other Punctuation130064
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0130064
50.0%
1104896
40.3%
220803
 
8.0%
32312
 
0.9%
42053
 
0.8%
Other Punctuation
ValueCountFrequency (%)
.130064
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common390192
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
1104896
26.9%
220803
 
5.3%
32312
 
0.6%
42053
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII390192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
1104896
26.9%
220803
 
5.3%
32312
 
0.6%
42053
 
0.5%

plan
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1016.2 KiB
1.0
106490 
2.0
11069 
3.0
10899 
4.0
 
1606

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters390192
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2.0
2nd row2.0
3rd row2.0
4th row2.0
5th row3.0

Common Values

Value   Count    Frequency (%)
1.0     106490   81.9%
2.0     11069    8.5%
3.0     10899    8.4%
4.0     1606     1.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1.0106490
81.9%
2.011069
 
8.5%
3.010899
 
8.4%
4.01606
 
1.2%

Most occurring characters

ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
1106490
27.3%
211069
 
2.8%
310899
 
2.8%
41606
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number260128
66.7%
Other Punctuation130064
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0130064
50.0%
1106490
40.9%
211069
 
4.3%
310899
 
4.2%
41606
 
0.6%
Other Punctuation
ValueCountFrequency (%)
.130064
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common390192
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
1106490
27.3%
211069
 
2.8%
310899
 
2.8%
41606
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII390192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.130064
33.3%
0130064
33.3%
1106490
27.3%
211069
 
2.8%
310899
 
2.8%
41606
 
0.4%

surf
Real number (ℝ≥0)

HIGH CORRELATION

Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.267652848
Minimum1
Maximum9
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile2
Maximum9
Range8
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.8101430081
Coefficient of variation (CV)0.6390890136
Kurtosis52.60434636
Mean1.267652848
Median Absolute Deviation (MAD)0
Skewness6.489757104
Sum164876
Variance0.6563316936
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)
Value   Count    Frequency (%)
1       104116   80.0%
2       24133    18.6%
9       604      0.5%
7       431      0.3%
5       255      0.2%
3       228      0.2%
8       195      0.1%
6       57       < 0.1%
4       45       < 0.1%
ValueCountFrequency (%)
1104116
80.0%
224133
 
18.6%
3228
 
0.2%
445
 
< 0.1%
5255
 
0.2%
657
 
< 0.1%
7431
 
0.3%
8195
 
0.1%
9604
 
0.5%
ValueCountFrequency (%)
9604
 
0.5%
8195
 
0.1%
7431
 
0.3%
657
 
< 0.1%
5255
 
0.2%
445
 
< 0.1%
3228
 
0.2%
224133
 
18.6%
1104116
80.0%

infra
Real number (ℝ≥0)

ZEROS

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.8636132981
Minimum0
Maximum9
Zeros108372
Zeros (%)83.3%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile6
Maximum9
Range9
Interquartile range (IQR)0

Descriptive statistics

Standard deviation2.204630496
Coefficient of variation (CV)2.552798227
Kurtosis5.753888973
Mean0.8636132981
Median Absolute Deviation (MAD)0
Skewness2.607570011
Sum112325
Variance4.860395626
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
Value   Count    Frequency (%)
0       108372   83.3%
5       7641     5.9%
9       4715     3.6%
2       2651     2.0%
3       2020     1.6%
1       1792     1.4%
8       1124     0.9%
6       1123     0.9%
4       527      0.4%
7       99       0.1%
ValueCountFrequency (%)
0108372
83.3%
11792
 
1.4%
22651
 
2.0%
32020
 
1.6%
4527
 
0.4%
57641
 
5.9%
61123
 
0.9%
799
 
0.1%
81124
 
0.9%
94715
 
3.6%
ValueCountFrequency (%)
94715
 
3.6%
81124
 
0.9%
799
 
0.1%
61123
 
0.9%
57641
 
5.9%
4527
 
0.4%
32020
 
1.6%
22651
 
2.0%
11792
 
1.4%
0108372
83.3%

situ
Real number (ℝ≥0)

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.327200455
Minimum1
Maximum8
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile3
Maximum8
Range7
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.117083132
Coefficient of variation (CV)0.8416838073
Kurtosis19.71290706
Mean1.327200455
Median Absolute Deviation (MAD)0
Skewness4.270393668
Sum172621
Variance1.247874724
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)
Value   Count    Frequency (%)
1       115568   88.9%
3       7488     5.8%
2       2012     1.5%
8       1959     1.5%
4       1154     0.9%
5       1021     0.8%
6       862      0.7%
ValueCountFrequency (%)
1115568
88.9%
22012
 
1.5%
37488
 
5.8%
41154
 
0.9%
51021
 
0.8%
6862
 
0.7%
81959
 
1.5%
ValueCountFrequency (%)
81959
 
1.5%
6862
 
0.7%
51021
 
0.8%
41154
 
0.9%
37488
 
5.8%
22012
 
1.5%
1115568
88.9%

vma
Real number (ℝ≥0)

HIGH CORRELATION

Distinct20
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean62.36531246
Minimum10
Maximum130
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1016.2 KiB

Quantile statistics

Minimum10
5-th percentile30
Q150
median50
Q380
95-th percentile110
Maximum130
Range120
Interquartile range (IQR)30

Descriptive statistics

Standard deviation22.26504736
Coefficient of variation (CV)0.3570101148
Kurtosis1.010881015
Mean62.36531246
Median Absolute Deviation (MAD)0
Skewness1.094829439
Sum8111482
Variance495.7323339
MonotonicityNot monotonic
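
A Median Absolute Deviation of 0 may look surprising next to a standard deviation above 22, but it follows from more than half of the observations sharing the median value 50. The sketch below illustrates this with the per-value counts listed for vma in this report; the helper name `weighted_median` is hypothetical, and the MAD is assumed to be the plain median of absolute deviations from the median, without any scaling factor.

```python
# Per-value counts for `vma`, assembled from the report's value tables.
vma_counts = {
    10: 343, 12: 2, 15: 61, 20: 282, 25: 19, 30: 7965, 35: 11,
    40: 120, 42: 4, 45: 53, 50: 72535, 60: 481, 65: 4, 70: 9885,
    80: 20206, 90: 8710, 100: 30, 110: 5638, 120: 2, 130: 3713,
}

def weighted_median(freq):
    """Median of a sample given as {value: count} (mean of the two middle items)."""
    ordered = sorted(freq.items())
    n = sum(count for _, count in ordered)

    def value_at(rank):
        # Return the value sitting at 0-based position `rank` in the sorted sample.
        seen = 0
        for value, count in ordered:
            seen += count
            if rank < seen:
                return value
        raise IndexError(rank)

    return (value_at((n - 1) // 2) + value_at(n // 2)) / 2

median = weighted_median(vma_counts)

# Frequency table of absolute deviations from the median.
deviations = {}
for value, count in vma_counts.items():
    key = abs(value - median)
    deviations[key] = deviations.get(key, 0) + count

mad = weighted_median(deviations)
print(median, mad)
```

Since 72535 of the 130064 observations have a deviation of exactly 0, the middle of the sorted deviations is 0 as well.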
Histogram with fixed size bins (bins=20)
Value               Count   Frequency (%)
50                  72535   55.8%
80                  20206   15.5%
70                  9885    7.6%
90                  8710    6.7%
30                  7965    6.1%
110                 5638    4.3%
130                 3713    2.9%
60                  481     0.4%
10                  343     0.3%
20                  282     0.2%
Other values (10)   306     0.2%
Value   Count   Frequency (%)
10      343     0.3%
12      2       < 0.1%
15      61      < 0.1%
20      282     0.2%
25      19      < 0.1%
30      7965    6.1%
35      11      < 0.1%
40      120     0.1%
42      4       < 0.1%
45      53      < 0.1%
Value   Count   Frequency (%)
130     3713    2.9%
120     2       < 0.1%
110     5638    4.3%
100     30      < 0.1%
90      8710    6.7%
80      20206   15.5%
70      9885    7.6%
65      4       < 0.1%
60      481     0.4%
50      72535   55.8%

Interactions

[Figures: 144 pairwise interaction plots, generated with Matplotlib v3.3.2]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the slope does not affect r; only its sign does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
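
As an illustration of the calculation described above, here is a minimal pure-Python sketch. The function name `pearson_r` is hypothetical and is not taken from the tool that produced this report; the code only assumes two equal-length numeric sequences.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
r_original = pearson_r(xs, ys)
# Shifting and rescaling y (with a positive factor) should leave r unchanged.
r_rescaled = pearson_r(xs, [10.0 * v + 3.0 for v in ys])
```

Because the shift and the positive rescaling cancel out in the ratio, both calls should return the same value up to floating-point rounding.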

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
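
The rank-based calculation can be sketched in pure Python as follows. The helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values the average of their ranks is an assumption of this sketch, not something the report states.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_rx) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_ry) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# A cubic relationship is nonlinear but strictly increasing.
rho_cubic = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```

Since only the ordering of the values matters, a strictly increasing nonlinear relationship such as the cubic one above should still yield ρ = 1 up to rounding.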

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
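
A direct, pure-Python sketch of this pair-counting definition is shown below. It implements the plain form described in the text (often called tau-a); the function name `kendall_tau_a` is hypothetical, and tie-adjusted variants such as tau-b, which statistical libraries commonly provide, are not covered.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Five of the six pairs are concordant and one is discordant: (5 - 1) / 6.
tau_example = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

This loop is quadratic in the number of observations, which is fine for illustration but would be slow on a dataset of this size.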

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
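
For illustration, the following pure-Python sketch computes a bias-corrected Cramér's V using the correction commonly attributed to Bergsma (2013). The function name `cramers_v_corrected` is hypothetical, and the code is an assumption about how such a measure can be computed, not the report generator's own implementation.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    pair_counts = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = pair_counts.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    # Correct phi^2 and the table dimensions for small-sample bias.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a variable with a single category
    return math.sqrt(phi2_corr / denom)


# y is fully determined by x here, so the association should be perfect.
v_perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
# Every (x, y) combination is equally frequent here, so x and y are independent.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
```

The correction subtracts the expected value of φ² under independence and shrinks the effective table dimensions, which keeps the estimate from drifting above 0 for independent variables.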

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

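(index) | Num_Acc | id_vehicule | num_veh | place | catu | grav | sexe | secu1 | senc | catv | obs | obsm | choc | manv | mois | lum | dep | agg | int | atm | col | lat | long | catr | circ | nbv | vosp | prof | plan | surf | infra | situ | vma
0 | 201900000001 | 138306524 | B01 | 2 | 2 | 4 | 2 | 1.0 | 2.0 | 7 | 0.0 | 2.0 | Other | Other | 11 | Other | Ile de france | 1 | 1 | 1.0 | 2.0 | 48,8962100 | 2,4701200 | 1 | 3.0 | 10.0 | 0.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 70
1 | 201900000001 | 138306524 | B01 | 1 | 1 | 4 | 2 | 1.0 | 2.0 | 7 | 0.0 | 2.0 | Other | Other | 11 | Other | Ile de france | 1 | 1 | 1.0 | 2.0 | 48,8962100 | 2,4701200 | 1 | 3.0 | 10.0 | 0.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 70
2 | 201900000001 | 138306525 | A01 | 1 | 1 | 1 | 1 | 1.0 | 2.0 | Other | Other | 0.0 | 3.0 | Other | 11 | Other | Ile de france | 1 | 1 | 1.0 | 2.0 | 48,8962100 | 2,4701200 | 1 | 3.0 | 10.0 | 0.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 70
3 | 201900000002 | 138306523 | A01 | 1 | 1 | 4 | 2 | 1.0 | 1.0 | 7 | Other | 0.0 | 1.0 | Other | 11 | 3 | Ile de france | 1 | 1 | 1.0 | 6.0 | 48,9307000 | 2,3688000 | 1 | 1.0 | 2.0 | 0.0 | 4.0 | 2.0 | 1.0 | 0.0 | 1.0 | 70
4 | 201900000003 | 138306520 | A01 | 1 | 1 | 1 | 1 | 1.0 | 1.0 | 7 | 0.0 | 2.0 | 1.0 | 2.0 | 11 | 1 | Ile de france | 1 | 1 | 1.0 | 4.0 | 48,9358718 | 2,3191744 | 1 | 3.0 | 8.0 | 0.0 | 1.0 | 3.0 | 1.0 | 0.0 | 1.0 | 90
5 | 201900000003 | 138306520 | A01 | 2 | 2 | 4 | 2 | 1.0 | 1.0 | 7 | 0.0 | 2.0 | 1.0 | 2.0 | 11 | 1 | Ile de france | 1 | 1 | 1.0 | 4.0 | 48,9358718 | 2,3191744 | 1 | 3.0 | 8.0 | 0.0 | 1.0 | 3.0 | 1.0 | 0.0 | 1.0 | 90
6 | 201900000003 | 138306521 | B01 | 1 | 1 | 4 | 1 | 1.0 | 1.0 | 7 | Other | 0.0 | 4.0 | 2.0 | 11 | 1 | Ile de france | 1 | 1 | 1.0 | 4.0 | 48,9358718 | 2,3191744 | 1 | 3.0 | 8.0 | 0.0 | 1.0 | 3.0 | 1.0 | 0.0 | 1.0 | 90
7 | 201900000003 | 138306522 | C01 | 1 | 1 | 1 | 1 | 1.0 | 1.0 | 7 | 0.0 | 2.0 | 4.0 | Other | 11 | 1 | Ile de france | 1 | 1 | 1.0 | 4.0 | 48,9358718 | 2,3191744 | 1 | 3.0 | 8.0 | 0.0 | 1.0 | 3.0 | 1.0 | 0.0 | 1.0 | 90
8 | 201900000004 | 138306517 | A01 | 1 | 1 | 1 | 1 | 1.0 | 2.0 | 7 | 0.0 | 2.0 | 4.0 | Other | 11 | 5 | Ile de france | 1 | 1 | 1.0 | 4.0 | 48,8173295 | 2,4281502 | 1 | 3.0 | 5.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 90
9 | 201900000004 | 138306518 | B01 | 1 | 1 | 1 | 1 | 1.0 | 2.0 | 7 | 0.0 | 2.0 | 4.0 | Other | 11 | 5 | Ile de france | 1 | 1 | 1.0 | 4.0 | 48,8173295 | 2,4281502 | 1 | 3.0 | 5.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 90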

Last rows

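(index) | Num_Acc | id_vehicule | num_veh | place | catu | grav | sexe | secu1 | senc | catv | obs | obsm | choc | manv | mois | lum | dep | agg | int | atm | col | lat | long | catr | circ | nbv | vosp | prof | plan | surf | infra | situ | vma
130054 | 201900058836 | 137982136 | B01 | 1 | 1 | 4 | 1 | 1.0 | 2.0 | 7 | 0.0 | 2.0 | Other | Other | 11 | 1 | Other | 1 | 1 | 5.0 | 2.0 | 45,6666600 | 5,0561200 | 1 | 1.0 | 3.0 | 0.0 | 1.0 | 1.0 | 2.0 | 3.0 | 1.0 | 130
130055 | 201900058836 | 137982137 | A01 | 1 | 1 | 4 | 1 | 1.0 | 2.0 | 7 | Other | 2.0 | 1.0 | Other | 11 | 1 | Other | 1 | 1 | 5.0 | 2.0 | 45,6666600 | 5,0561200 | 1 | 1.0 | 3.0 | 0.0 | 1.0 | 1.0 | 2.0 | 3.0 | 1.0 | 130
130056 | 201900058836 | 137982137 | A01 | 2 | 2 | 4 | 1 | 1.0 | 2.0 | 7 | Other | 2.0 | 1.0 | Other | 11 | 1 | Other | 1 | 1 | 5.0 | 2.0 | 45,6666600 | 5,0561200 | 1 | 1.0 | 3.0 | 0.0 | 1.0 | 1.0 | 2.0 | 3.0 | 1.0 | 130
130057 | 201900058837 | 137982133 | A01 | 1 | 1 | 1 | 2 | 1.0 | 1.0 | 7 | 0.0 | 2.0 | 4.0 | 2.0 | 11 | 1 | Other | 1 | 1 | 8.0 | 4.0 | 48,5769000 | 7,7269000 | 1 | 1.0 | 2.0 | 0.0 | 1.0 | 1.0 | 2.0 | 0.0 | 1.0 | 90
130058 | 201900058837 | 137982134 | B01 | 1 | 1 | 4 | 1 | 1.0 | 1.0 | 7 | 0.0 | 2.0 | 1.0 | 2.0 | 11 | 1 | Other | 1 | 1 | 8.0 | 4.0 | 48,5769000 | 7,7269000 | 1 | 1.0 | 2.0 | 0.0 | 1.0 | 1.0 | 2.0 | 0.0 | 1.0 | 90
130059 | 201900058837 | 137982135 | C01 | 1 | 1 | 4 | 2 | 1.0 | 1.0 | 7 | 0.0 | 2.0 | 1.0 | 2.0 | 11 | 1 | Other | 1 | 1 | 8.0 | 4.0 | 48,5769000 | 7,7269000 | 1 | 1.0 | 2.0 | 0.0 | 1.0 | 1.0 | 2.0 | 0.0 | 1.0 | 90
130060 | 201900058838 | 137982132 | A01 | 1 | 1 | 4 | 1 | 1.0 | 2.0 | 7 | Other | 0.0 | 1.0 | Other | 11 | Other | Ile de france | 1 | 1 | 1.0 | 6.0 | 48,7717000 | 2,3457600 | 1 | 3.0 | 3.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 90
130061 | 201900058839 | 137982131 | A01 | 1 | 1 | 3 | 1 | 2.0 | 2.0 | 33 | 0.0 | 0.0 | Other | 1.0 | 11 | 1 | Ile de france | 1 | 1 | 1.0 | 7.0 | 48,7772890 | 2,2237590 | 1 | 1.0 | 1.0 | 0.0 | 1.0 | 3.0 | 1.0 | 0.0 | 1.0 | 50
130062 | 201900058840 | 137982129 | B01 | 1 | 1 | 4 | 1 | 1.0 | 1.0 | 10 | 0.0 | 2.0 | 4.0 | Other | 11 | 3 | Ile de france | 1 | 1 | 1.0 | 2.0 | 48,8351236 | 2,1751101 | 1 | 1.0 | 3.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 2.0 | 110
130063 | 201900058840 | 137982130 | A01 | 1 | 1 | 1 | 1 | 1.0 | 1.0 | 10 | 0.0 | 2.0 | 1.0 | Other | 11 | 3 | Ile de france | 1 | 1 | 1.0 | 2.0 | 48,8351236 | 2,1751101 | 1 | 1.0 | 3.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 2.0 | 110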

Duplicate rows

Most frequently occurring

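(index) | Num_Acc | id_vehicule | num_veh | place | catu | grav | sexe | secu1 | senc | catv | obs | obsm | choc | manv | mois | lum | dep | agg | int | atm | col | lat | long | catr | circ | nbv | vosp | prof | plan | surf | infra | situ | vma | # duplicates
710 | 201900027084 | 138255393 | A01 | 10 | 3 | 4 | 2 | 8.0 | 2.0 | 10 | Other | 1.0 | 2.0 | Other | 3 | 5 | Other | 2 | 1 | 2.0 | 6.0 | 15,9935700 | -61,7250900 | 4 | 2.0 | 2.0 | 0.0 | 2.0 | 1.0 | 2.0 | 0.0 | 1.0 | 50 | 13
716 | 201900027368 | 138254819 | A01 | Other | 2 | 1 | 1 | 0.0 | 3.0 | Other | Other | 0.0 | Other | Other | 11 | 1 | Other | 1 | 1 | 8.0 | 6.0 | 44,3164450 | 1,7155020 | 9 | 2.0 | 2.0 | 0.0 | 2.0 | 2.0 | 1.0 | 0.0 | 3.0 | 50 | 12
1451 | 201900054644 | 138203344 | B01 | Other | 2 | 4 | 2 | 8.0 | 0.0 | Other | 0.0 | 2.0 | 1.0 | 1.0 | 12 | 1 | Ile de france | 2 | 1 | 1.0 | 1.0 | 48,7597300 | 2,3467600 | 9 | 2.0 | 2.0 | 3.0 | 1.0 | 3.0 | 7.0 | 2.0 | 1.0 | 50 | 12
72 | 201900003323 | 138300270 | A01 | Other | 2 | 1 | 1 | 1.0 | 1.0 | Other | 0.0 | 2.0 | 3.0 | Other | 6 | 1 | Other | 1 | 1 | 8.0 | 1.0 | 45,2339207 | 1,4252776 | 3 | 2.0 | 2.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 90 | 11
1143 | 201900043950 | 138223560 | B01 | Other | 2 | 4 | 2 | 8.0 | 2.0 | Other | 0.0 | 2.0 | 3.0 | 15.0 | 6 | 1 | Other | 2 | 6 | 1.0 | 6.0 | 14,6140500 | -61,0518100 | 1 | 2.0 | 4.0 | 3.0 | 1.0 | 1.0 | 1.0 | 3.0 | 1.0 | 50 | 10
1329 | 201900049861 | 138212361 | B01 | Other | 2 | 1 | 2 | 1.0 | 1.0 | Other | 0.0 | 2.0 | 3.0 | 1.0 | 1 | 1 | Other | 1 | 1 | 1.0 | 1.0 | 46,4234386 | 3,9769152 | 3 | 2.0 | 2.0 | 0.0 | 1.0 | 3.0 | 1.0 | 0.0 | 1.0 | 80 | 10
556 | 201900021572 | 138266054 | A01 | Other | 2 | 4 | 2 | 8.0 | 1.0 | Other | Other | 0.0 | 3.0 | 1.0 | 8 | 1 | Other | 1 | 1 | 1.0 | 6.0 | 44,3003300 | 3,3135880 | 3 | 2.0 | 2.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 3.0 | 80 | 9
709 | 201900027084 | 138255393 | A01 | 10 | 3 | 4 | 1 | 8.0 | 2.0 | 10 | Other | 1.0 | 2.0 | Other | 3 | 5 | Other | 2 | 1 | 2.0 | 6.0 | 15,9935700 | -61,7250900 | 4 | 2.0 | 2.0 | 0.0 | 2.0 | 1.0 | 2.0 | 0.0 | 1.0 | 50 | 9
555 | 201900021572 | 138266054 | A01 | Other | 2 | 4 | 1 | 8.0 | 1.0 | Other | Other | 0.0 | 3.0 | 1.0 | 8 | 1 | Other | 1 | 1 | 1.0 | 6.0 | 44,3003300 | 3,3135880 | 3 | 2.0 | 2.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 3.0 | 80 | 8
1009 | 201900039069 | 138232653 | B01 | Other | 2 | 1 | 2 | 1.0 | 2.0 | Other | 0.0 | 0.0 | 4.0 | 2.0 | 1 | 5 | Other | 1 | 1 | 2.0 | 2.0 | 43,5424790 | 3,7893000 | 1 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 2.0 | 3.0 | 1.0 | 30 | 8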